-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Static HelmRepository OCI #1243
Conversation
Care to elaborate on the reason for this change? |
@makkes I'll add more details in the PR description but this was a quick draft to discuss about the results of this experiment in the community meeting today. |
For transparency's sake I would recommend creating an issue that describes the observed problems with the status field. |
cb28864
to
4230894
Compare
93b3ac1
to
7b02790
Compare
7b02790
to
778b263
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens with kstatus and Flux kustomize-controller after the status is reset, will it be stuck forever in waiting? Assuming a change is made in desired state, KC will update the HelmRepo, the API will set the observed generation to -1
, then SC will reset the observed generation to 0 but the actual generation is >1.
I may be missing something but I'm not sure what exactly you mean by kstatus getting stuck in waiting. In fluxcd/notification-controller#540 (comment), we discussed about how empty status is considered ready by kstatus. And I also created fluxcd/flux2#4311 to verify that theory for alerts, providers and helmrepo OCI. The migration process ensure that status is removed from the object if anyone adds it back. Because I'm not sure about the concern, I tried computing the helmrepo object status using kstatus compute function. When the observed generation is -1, kstatus results in inprogress. When the client polls again and the object undergoes migration and status is removed, kstatus checkGeneration() which looks for |
The controller does not remove the status, instead it does |
As I mentioned above, even if this happens, the migration would make it work nicely. Any update will emit an event that'll trigger the migration. But to investigate this behavior further, why don't I seeing any defaulting in the status on updates, I read some docs and tried a few things. As per https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#defaulting-and-nullable, defaulting happens based on the null value of fields. Our API fields are not nullable so a null value is removed unless they have a default value.
Because of this, when a new object is created without any status, the null value of the status is replaced by the configured default value by the API server. Hence, we see status:
observedGeneration: -1
status: {} If I again edit the status to remove status: or add status: null it results in the default to be applied. status:
observedGeneration: -1 Since the HelmRepository OCI migration process sets the status to be |
Ok great, thanks for investigating this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
4777769
to
2ca1247
Compare
Remove the HelmRepositoryOCI reconciler and make HelmRepository of type OCI static. The existing HelmRepository OCI objects are migrated to static object by removing their finalizers and status. New HelmRepository OCI objects go through one time migration to remove the status. These are not reconciled again, unless the type is changed to default. On type switching from HelmRepository default to OCI, the finalizer, status and artifact are removed to make the object static. On switching from OCI to default, a complete reconciliation of HelmRepository takes place to build artifact and add status and finalizer. The HelmRepository .spec.url has a new validation to check the URL scheme. This is to add some validation to HelmRepository OCI since it's not backed by a reconciler for full validation. Add HelmRepositoryOCIMigrationPredicate predicate to detect and allow reconciliation of HelmRepository OCI objects that need migration. The other predicates that filtered the HelmRepository events based on the type have been removed as all the HelmRepositories will now be reconciled by a single reconciler. HelmRepositoryOCIMigrationPredicate readily allows non-OCI objects and only checks if a migration is needed for OCI type object. Add controller tests for different migration scenarios. Signed-off-by: Sunny <[email protected]>
Signed-off-by: Sunny <[email protected]>
With static HelmRepository OCI, the interval become optional. Make interval optional in the API. Introduce getters for interval, in the form of GetRequeueAfter(), and timeout with internal default values. HelmRepository will not have interval and timeout fields unless it's explicitly set. Signed-off-by: Sunny <[email protected]>
Although all the APIs had interval as a required field, when tests objects were created, they had the zero value of interval, which the API server accepts. A zero interval value results in the test objects to reconcile only once when they are created and never reconcile again unless there's an update to the object. Most of the tests worked with this behavior. With HelmRepository removing the interval requirement and adding an internal default, all the HelmRepository objects created in the tests without any interval have a default interval value which results in objects to reconcile automatically if they are not cleaned up after running tests. TestHelmRepositoryReconciler_InMemoryCaching and TestHelmChartReconciler_Reconcile create HelmRepository but doesn't delete it at the end. This leads to a reconciliation of HelmRepository outside of the test in the envtest environment. It just happened to be that the reconciliation time matches with the end of test time. At the end of the test run, the reconcilers receive shutdown signal and any test server, like helmrepository server, are stopped. A HelmRepository reconciliation triggered just before the shutdown signal gets stuck in the reconciliation. HelmRepository can't download the index as the test index server has stopped and hangs for some time. The HelmRepository reconciler worker remains in active state, unlike other reconciler workers that shut down, resulting in the test to timeout at the end. The is fixed by deleting the HelmRepository object created in TestHelmRepositoryReconciler_InMemoryCaching and TestHelmChartReconciler_Reconcile at the end of the test similar to other tests. Signed-off-by: Sunny <[email protected]>
2ca1247
to
1a7adeb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Went over it one more time, thanks @darkowlzz 🏅
To address #1249, this change makes the HelmRepository of type OCI static object, no status.
With this change HelmRepository OCI objects become:
The empty status can't be removed as the API still has it for the default HelmRepository API.
This HelmRepository spec fields
interval
has been made optional with an internal default. Unless an interval is explicitly set, no interval is set on the object. The default value oftimeout
at the API level has been removed and an internal default is added. Unless a timeout is explicitly set, no timeout is set on the object.Due to the default values in the HelmRepository API, the kubernetes API server automatically fills the
.status.observedGeneration
with value-1
. This can be removed using a migration that can be performed by the HelmRepositoryReconciler. Every newly created object goes through the migration to be patched with an empty status.For migration of the existing HelmRepository OCI objects, the HelmRepositoryReconciler performs the migration by removing finalizers and status. When converting the object from HelmRepository default to OCI, HelmRepositoryReconciler migrates the object by also removing any artifact that previously existed.
A new API level validation is added for the
.spec.url
field to ensure onlyhttp
,https
andoci
URL schemes are allowed. This performs some validation to the URL that's lost by removing the status and the HelmRepositoryOCIReconciler.The
HelmRepositoryTypePredicate
has be removed as all the HelmRepositories are now handled by the same reconciler. A new predicateHelmRepositoryOCIMigrationPredicate
is introduced to allow migration reconciliation for objects that are of OCI type but need migration. This ensures that once migrated, these objects are not reconciled again. This predicate readily allows non-OCI HelmRepository events.Also added a test
TestHelmRepositoryReconciler_ociMigration
to verify that the migration of objects are handled correctly in different scenarios.Events related to object migration when the object type is switched:
Fix: #1249